Individual Poster Page

See copyright notice at the bottom of this page.



Aging Patterns

June 27, 2002 - Ben Vollmayr-Lee

Nice work Tango.

I missed where the confidence levels come in. But in response to Gerry's comment: for applications where you are determining the confidence level of the mean (say, the average OPS of 30-year-olds), you do not need the distribution of OPS to be normal. The reason is the central limit theorem, which says that means are normally distributed (once the sample is big enough) even when the individual variables are not. This is a very powerful theorem and is basically the foundation of the field of statistics.
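A quick numerical illustration of the theorem (a sketch in Python; the exponential distribution and the sample sizes are arbitrary choices for the demo, not anything from the discussion):

```python
import random
import statistics

# Draw individual values from a decidedly non-normal distribution
# (exponential), then look at the distribution of sample MEANS. The
# means cluster around the true mean with spread ~ sigma/sqrt(N),
# even though single draws look nothing like a bell curve.
random.seed(1)

N = 100          # draws per sample ("big enough")
trials = 2000    # number of sample means to collect

means = [statistics.fmean(random.expovariate(1.0) for _ in range(N))
         for _ in range(trials)]

# The Exp(1) distribution has mean 1.0 and std dev 1.0, so the std dev
# of the sample mean should come out near 1/sqrt(100) = 0.1.
print(round(statistics.fmean(means), 2))   # close to 1.0
print(round(statistics.stdev(means), 2))   # close to 0.1
```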



July 4, 2002 - Ben Vollmayr-Lee

Okay, there's a lot of confusion here on the BA distribution question, and I contributed to it, so let me try to straighten it out. Pulling no punches (but all meant in good fun), and going chronologically:

Tango's confusion: his poll example was a perfect use of the central limit theorem. But his application to how a batter's BA over 600 AB is related to their 'true' BA ability wasn't perfect. If the batter has the same chance of a hit in each of their AB then the CL theorem would apply. But the batter doesn't. As it turns out, neglecting this issue and blindly applying the CL theorem is a reasonable thing to do anyway, but it requires some motivation and is not explained by the poll example. Bad Tango.
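The idealized case described above, where the batter really does have the same chance of a hit in every AB, can be sketched numerically (Python; the p = .300 and the 5000 simulated seasons are just illustrative choices):

```python
import math
import random
import statistics

# If a batter has the SAME hit probability p in every AB, the 600-AB
# batting average is a binomial mean and the CL theorem applies
# directly: it is approximately normal around p with std dev
# sqrt(p*(1-p)/600).
random.seed(2)

p, AB = 0.300, 600
seasons = [sum(random.random() < p for _ in range(AB)) / AB
           for _ in range(5000)]

predicted = math.sqrt(p * (1 - p) / AB)
print(round(predicted, 4))                  # about 0.0187
print(round(statistics.stdev(seasons), 4))  # should land close to that
```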

Gerry's confusion, part I: might not have existed. He asked "Do you know that batting averages ... follows a normal distribution?" If he was thinking that the issue is whether the distribution of batting averages around the league is normal, then he missed Tango's point. But if he was thinking "do you know that a player's actual 600 AB batting average is distributed normally around their 'true' batting average?" then he was asking exactly the right question.

Ben's confusion: I thought Tango was applying confidence levels to, say, the average BA of all 27-year-olds. Here the CL theorem would apply: the average BA taken from N players is normally distributed provided that N is big enough. I qualified my statements correctly, but of course this had nothing to do with Tango's argument, so my reply to Gerry was a non sequitur and way purple.

Gerry's confusion, part II: instead of calling me on my non-sequiturarity (see, I like pizza), he questioned the hypotheses of the CL theorem. For what it's worth: the conditions for the CL theorem to apply are simply that the original distribution needs to have a finite standard deviation, and N independent samples have to be drawn from this distribution, with N "big enough" (for realistic applications, 20 gets you in the ballpark and 100 is damned accurate). The problem with Tango's use of the CL theorem with BA is that the samples are drawn from different distributions: a batter's chance of a hit against Randy Johnson is different than it is against Livan Hernandez. And Bob Brenly is happy about this.

Ed's confusion: probably induced by the mess that preceded him, he thought the issue was the distribution of batting averages around the league.

I hope this is taken as lightly as intended and no one is offended. Or failing that, I hope I offended everyone equally so as to be democratic about it. Now for the argument why Tango's analysis of confidence levels for BA is reasonable:

Instead of having 600 AB with a chance 'p' of getting a hit each time, a more realistic model is that a batter has some 20 AB with a chance 'p1', another 35 AB with a chance 'p2', another 20 with a chance 'p3', etc. (these could come from ranking the toughness of the pitchers and throwing in a Coors factor, for example). The CL theorem can be applied to each of those sub-samples, with each of their 'p' values and N, but that doesn't yet tell you much about what to expect for the combined average. However, two things work to your advantage:

1) The std dev of the 2-case distribution (hit or not-hit) is not strongly dependent on the probability 'p' away from the extremes of 0 and 1. The std dev is sqrt(p*(1-p)), so you get .433 for p=.25 and .477 for p=.35. These fall within 10% of each other, and if your end conclusion is that a .300 hitter is .300 +- .010, you're already working to only 10% accuracy simply by rounding the .010 to the third decimal place. So we can approximate the std dev for each of the sub-sample means by just using the overall rate 'p'. (Note that if the p1, p2, ... vary between .25 and .35, you're doing much better than 10% accuracy when you average together the subsamples, because both end points are within 5% of the middle. Even to get 5% error you would need all cases to sit at the extremes; in reality they span the spectrum and are more likely to be in the middle, so we're talking 1-2% accuracy.)

2) These different distributions with their own p1, p2, etc. have N's much smaller than 600 AB in their sqrt(p*(1-p)/N) (the std dev of their AVERAGE), so how can we justify using sqrt(p*(1-p)/600) as the std dev for the combined average? The reason is that each sub-sample average is independent of the others, so when combining them the errors tend to cancel to some degree. This is the same effect that makes the CL theorem work in the first place; we're just doing a more complicated version of it this time.
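Both points are easy to check numerically. A sketch (Python; the sub-sample sizes and p values below are invented for illustration, not taken from any real pitcher rankings):

```python
import math
import random
import statistics

# Point 1: sqrt(p*(1-p)) varies only weakly with p away from 0 and 1.
for p in (0.25, 0.30, 0.35):
    print(p, round(math.sqrt(p * (1 - p)), 3))   # .433, .458, .477

# Point 2: a 600-AB season built from independent sub-samples with
# DIFFERENT hit probabilities (tough pitchers, Coors, etc. -- these
# particular sizes and p's are made up). The season BA still spreads
# out close to sqrt(pbar*(1-pbar)/600), because the sub-sample errors
# partially cancel.
random.seed(3)
subsamples = [(150, 0.25), (150, 0.28), (150, 0.32), (150, 0.35)]
AB = sum(n for n, _ in subsamples)             # 600
pbar = sum(n * p for n, p in subsamples) / AB  # overall rate, 0.300

def season_ba():
    hits = sum(sum(random.random() < p for _ in range(n))
               for n, p in subsamples)
    return hits / AB

bas = [season_ba() for _ in range(5000)]
print(round(math.sqrt(pbar * (1 - pbar) / AB), 4))  # naive prediction, 0.0187
print(round(statistics.stdev(bas), 4))              # simulated spread, very close
```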


Baseball: Pythagorean Method (February 11, 2004)

Discussion Thread

Posted 10:44 a.m., February 11, 2004 (#3) - Ben Vollmayr-Lee
  Can anyone point me towards Pythagopat?



Posted 10:51 a.m., February 11, 2004 (#5) - Ben Vollmayr-Lee
  Is the 'Pat' short for Patriot? I assume this form gives a lower RMS than the others? It must be a small effect, though, because RPG^0.28 is quite linear in the range of historical run environments.
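That near-linearity is easy to check; a quick sketch (Python), taking 7 to 12 total runs per game as a rough span of historical run environments (my assumption):

```python
# Compare RPG**0.28 over a rough span of run environments (7 to 12
# combined runs per game) to the straight line through its endpoints.
lo, hi = 7.0, 12.0

def f(rpg):
    return rpg ** 0.28

slope = (f(hi) - f(lo)) / (hi - lo)

def line(rpg):
    return f(lo) + slope * (rpg - lo)

# Largest gap between curve and chord, sampled every half run.
worst = max(abs(f(lo + 0.5 * k) - line(lo + 0.5 * k)) for k in range(11))
print(round(worst, 3))  # about 0.013, under 1% of the function's value
```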



Posted 5:25 p.m., February 11, 2004 (#9) - Ben Vollmayr-Lee
Greg: I will be leaving this on my web page indefinitely, so it's probably safe. Though I have no problem with Primer archiving it (maybe I should finish formatting it first). To get to a lower RMS I'm pretty sure you would need to account for more information than just RS and RA.

Patriot: thanks for the description. So RPG^x has a feature like pythagoras, that it hits an extreme properly. But, as David already said, this is a lesser concern than fitting the data well in the typical RPG range.

Evidently RPG^0.28 does well as judged by RMS (presumably what Tango showed). Which is interesting, because it's a 1-parameter fit, so it's 'fitting with one hand tied behind its back'. I don't have my data set up at the moment to conveniently run it through and find the RMS.



Posted 3:03 p.m., February 12, 2004 (#11) - Ben Vollmayr-Lee
  Pythagopat revisited: In the old post I had tried a formula of the form

wpct = 0.5 + a RPG^b (x-1/2)

I didn't report the fit values, but they turn out to be a=0.933765, b=0.309298 (linear least square fitting). This gave the RMS of 4.192 reported. If I go back and fit with

wpct = 0.5 + RPG^b (x-1/2)

I get b = 0.277742 with also RMS 4.192. This is a good sign that the 1-parameter fit is better than the 2-parameter fit. Now if I pythagorize 'n', i.e. take

n = a RPG^b

in the pythagorean formula, I get a=0.94935, b=0.310016, with RMS 4.185. Going all the way to Pythagopat, n = RPG^b, I get b=0.286089 with RMS 4.184.*
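For concreteness, that final fit can be written out as code (a sketch; taking RS and RA as per-game scoring rates is my convention here, and the sample inputs are invented):

```python
# Sketch of the fitted Pythagopat form: the pythagorean formula
# wpct = RS^n / (RS^n + RA^n), with the exponent set by the run
# environment, n = RPG^b, using the fitted b = 0.286089 from above.
b = 0.286089

def pythagopat_wpct(rs_pg: float, ra_pg: float) -> float:
    n = (rs_pg + ra_pg) ** b   # RPG = combined runs per game
    return rs_pg ** n / (rs_pg ** n + ra_pg ** n)

print(round(pythagopat_wpct(5.0, 4.5), 3))  # a good team in a ~9.5 RPG environment
print(round(pythagopat_wpct(4.5, 4.5), 3))  # equal runs scored and allowed: .500
```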

Two comments:

1) these RMS values are right in line with the RMS obtained from other functional forms that had some form of diminishing returns (pythagoras or cubic correction) and a RPG-dependent slope.

2) Pythagopat, being constructed to behave reasonably in the RPG=1 limit, matches the performance of 2-parameter fits with only 1 parameter. This lends some merit to the theoretical idea behind Pythagopat. The way I would put it: the rules of baseball (no ties) are evidently one of the main sources of the RPG-dependence in wpct formulas. And this is pretty cool. I learned something new.

* You might be troubled that RPG^b can fit to a lower RMS than a*RPG^b. The reason is that what I'm reporting as RMS (root-mean-square) uses in the denominator of the mean NOT the number of team-seasons in my sample (1972), but rather the number of team-seasons minus the number of fitting parameters. So the one-parameter fit had 1971 in the denominator and the two-parameter fit had 1970 in the denominator. This version of RMS is a better measure of the quality of the fit, and it tells us in this case that adding an extra parameter brings no real improvement.
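The RMS convention in the footnote can be made concrete (a sketch; the residuals and parameter counts below are invented for illustration):

```python
import math

# Degrees-of-freedom adjusted RMS: divide the sum of squared residuals
# by (number of data points - number of fit parameters), not by the
# number of points, before taking the root. Adding a parameter that
# doesn't reduce the residuals then RAISES this RMS.
def adjusted_rms(residuals, n_params):
    dof = len(residuals) - n_params
    return math.sqrt(sum(r * r for r in residuals) / dof)

residuals = [1.0, -2.0, 0.5, 1.5, -1.0]
print(round(adjusted_rms(residuals, 1), 3))  # 1.458
print(round(adjusted_rms(residuals, 2), 3))  # 1.683: fewer dof, larger RMS
```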


Copyright notice

Comments on this page were made by person(s) with the same handle, in various comments areas, following Tangotiger © material, on Baseball Primer. All content on this page remains the sole copyright of the author of those comments.

If you are the author, and you wish to have these comments removed from this site, please send me an email (tangotiger@yahoo.com), along with (1) the URL of this page, and (2) a statement that you are in fact the author of all comments on this page, and I will promptly remove them.